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Bayesian highest posterior density (HPD) intervals can be estimated directly 
from simulations via empirical shortest intervals. Unfortunately, these can be 
noisy (that is, have a high Monte Carlo error). We derive an optimal weighting 
strategy using bootstrap and quadratic programming to obtain a more compu- 
tationally stable HPD, or in general, shortest probability interval (Spin). We 
prove the consistency of our method. Simulation studies on a range of theoret- 
ical and real-data examples, some with symmetric and some with asymmetric 
posterior densities, show that intervals constructed using Spin have better cov- 
erage (relative to the posterior distribution) and lower Monte Carlo error than 
empirical shortest intervals. We implement the new method in an R package 
(SPIn) so it can be routinely used in post-processing of Bayesian simulations. 
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1 Introduction 

It is standard practice to summarize Bayesian inferences via posterior intervals of 
specified coverage (for example, 50% and 95%) for parameters and other quanti- 
ties of interest. In the modern era in which posterior distributions are computed 
via simulation, we most commonly see central intervals: the 100(1— a)% central 
interval is defined by the ^ and 1—^ quantiles. Highest-posterior density (HPD) 
intervals (recommended, for example, in the classic book of Box and Tiao, 1973) 
are easily determined for models with closed-form distributions such as the nor- 
mal and gamma but are more difficult to compute from simulations. 

We would like to go back to the HPD, solving whatever computational prob- 
lems necessary to get it to work. Why? Because for an asymmetric distribution, 
the HPD interval can be a more reasonable summary than the central probability 
interval. Figure [T] shows these two types of intervals for three distributions: for 
symmetric densities (as shown in the left panel in Figure [T]) , central and HPD 
intervals coincide; whereas for the two examples of asymmetric densities (the 
middle and right panels in Figure [l]), HPDs are shorter than central intervals 
(in fact, the HPD is the shortest interval containing a specified probability). 
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Figure 1: Simple examples of central (black) and highest probability density 
(red) intervals. The intervals coincide for a symmetric distribution; otherwise 
the HPD interval is shorter. The three examples are a normal distribution, a 
gamma with shape parameter 3, and the marginal posterior density for a variance 
parameter in a hierarchical model. 

In particular, when the highest density occurs at the boundary (the right 
panel in Figure [T]) , we strongly prefer the shortest probability interval to the 
central interval; the HPD covers the highest density part of the distribution and 
also the mode. In such cases, central intervals can be much longer and have the 
awkward property at cutting off a narrow high-posterior slice that happens to 
be near the boundary, thus ruling out a part of the distribution that is actually 
strongly supported by the inference. 

One concern with highest posterior density intervals is that they depend on 
parameterization. For example, the left endpoint of the HPD in the right panel 
of Figure [T] is 0, but the interval on the logarithmic scale does not start at — oo. 
Interval estimation is always conditional on the purposes to which the estimate 
will be used. Beyond this, univariate summaries cannot completely capture 
multivariate relationships. Thus all this work is within the context of routine 
data analysis (e.g., Spiegelhalter et al., 1994, 2002) in which interval estimates 
are a useful way to summarize inferences about parameters and quantities of 
interest in a model in understandable parameterizations. We do not attempt a 
conclusive justification of HPD intervals here; we merely note that in the pre- 
simulation era such intervals were considered the standard, which suggests to 
us that the current preference for central intervals arises from computational 
reasons as much as anything else. 

For the goal of computing an HPD interval from posterior simulations, the 
most direct approach is the empirical shortest probability interval, the shortest 
interval of specified probability coverage based on the simulations (Chen and 
Shao, 1999). For example, to obtain a 95% interval from a posterior sample of 
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Figure 2: Lengths of 95% empirical prohabihty intervals from several simulations 
for each of three models. Each gray curve shows interval length as a function 
of the order statistic of the interval's lower endpoint; thus, the minimum value 
of the curve corresponds to the empirical shortest 95% interval. For the (sym- 
metric) normal example, the empirical shortest interval is typically close to the 
central interval (for example, with a sample of size 1000, it is typically near 
(a;(26), 2;(975))j. The gamma and eight-schools examples are skewed with a peak 
near the left of the distribution, hence the empirical shortest intervals are typ- 
ically at the left end of the scale. The red lines show the lengths of the true 
shortest 95% probability interval for each distribution. The empirical shortest 
interval approaches the true value as the number of simulation draws increases 
but is noisy when the number of simulation draws is small, hence motivating a 
more elaborate estimator. 
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size n, you can order the simulation draws and then take the shortest interval 
that contains 0.95n of the draws. This procedure is easy, fast, and simulation- 
consistent (that is, as n— t-oo it converges to the actual HPD interval assuming 
that the HPD interval exists and is unique) . The only trouble with the empirical 
shortest probability interval is that it can be too noisy, with a high Monte Carlo 
error (compared to the central probability interval) when computed from the 
equivalent of a small number of simulation draws. This is a concern with current 
Bayesian methods that rely on Markov chain Monte Carlo (MCMC) techniques, 
where for some problems the effective sample size of the posterior draws can be 
low (for example, hundreds of thousands of steps might be needed to obtain an 
effective sample size of 500). 

Figure [2] shows the lengths of the empirical shortest 95% intervals based 
on several simulations for the three distributions shown in Figure [T| starting 
from the kth order statistic. For each distribution and each specified number of 
independent simulation draws, we carried out 200 replications to get a sense of 
the typical size of the Monte Carlo error. The lengths of the 95% intervals are 
highly variable when the number of simulation draws is small. 

In this paper, we develop a quadratic programming strategy coupled with 
bootstrapping to estimate the endpoints of the shortest probability interval. 
Simulation studies show that our procedure, which we call Spin, results in more 
stable endpoint estimates compared to the empirical shortest interval (Figure 
[3]). Specifically, define the efficiency as 

MSE(empirical shortest interval) 
^^"^"^^ = MSE(Spin) ' 

so that an efficiency greater than 1 means that Spin is more efficient. We show 
in Figure [3] that, in all cases that we experimented on. Spin is more efficient than 
the competition. We derive our method in Section[2j apply it to some theoretical 
examples in Section [3] and in two real-data Bayesian analysis problems in Section 
[4| We have implemented our algorithm as SPIn, a publicly available package in 
R (R Development Core Team, 2011). 



2 Methods 

2.1 Problem setup 

Let Xi,. . . , Xn *~ F, where F is a continuous unimodal distribution. The goal 
is to estimate the 100(1 — a)% shortest probability interval for F. Denote the 
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Figure 3: EfRciency of Spin for 95% shortest intervals for the three distributions 
shown in Figure\^ For the eight-schools example, Spin is compared to a modified 
empirical HPD that includes the zero point in the simulations. The efficiency 
is always greater than 1, indicating that Spin always outperforms the empirical 
HPD. The jagged appearance of some of the lines may arise from discreteness 
in the order statistics for the 95% interval. 
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Figure 4: Notation for shortest probabihty intervals. 



true shortest probability interval by {l{a),u{a)). Define G = 1 — F, so that 
F{l{a)) + G{u{a)) = a. 

To estimate the interval, for < A < a, find A such that G^^{a — A) — 
F~^{A) is a minimum, i.e., 

A* = argmin^^p.^llG-H" "A) - ^-^(A)}. 
Taking the derivative, 
d 



dA 



[(1_F)-1(„_A)-F-HA)] = 0, 



we get 



/(G-i(«-A)) /(F-i(A)) °' 
where / is the probability density function of X. The minimum can only be 
attained at solutions to ([T]) , or A = or a (Figure |4]) . It can easily be shown 
that if f'{x) / a.e., the solution to ([T]) exists and is unique. Then 

l{a) = F-\A*), 
u{a) = G-\a-A*). 

Taking the lower end for example, we are interested in a weighting strategy 
such that / = Y17=i'^i^(i) (where Yl'^i ~ ^) minimum mean squared 

error (MSE), E ^| |^"^j^tt;jX(j) — /(a)||^y It can also be helpful to calculate 

MSE(Xq„^.])) = E (||^([„A*]) ~ K"^)!!^)- practice we estimate A* by A such 
that 



A = argmin^g.o,al{G'-'(« - A) - F-\A)}, 



(2) 
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where F represents the empirical distribution and G = 1 — F. This yields the 
widely used empirical shortest interval, which can have a high Monte Carlo 
error (as illustrated in Figure [2]). We will denote its endpoints by I* and u*. 
The corresponding MSE for the lower endpoint is E(||X/r ^i-, — /(a)|p). 



2.2 Quadratic programming 

Let I = Y17=i Then 

MSE(/) = ^{i-F-^{/\*)f 

= E(/ -Ef+E/ -F-i(A*))2 
= E(Z -E/y + (E/ -F-i(A*))2 
= Var + Bias^ , 

where E(Z) = Yd^i Wi^X(j^) and Var = YJl^i VarX(j)+2 Y.i<j WiWjCov{X(^i), X^j-j). 
It has been shown (e.g., David and Nag 2003) that 



E(X(,)) = Qi + 



Pi', 



-Q': + o{n-'), 



2(n + 2) 

where qi = 1 — pi, Q = F~^ is the quantile function, Qi = Q{pi) = Q(EC/(j)) 
Q(^),and Q'I = j%-y Thus 



i=l 



PiQi 



2(n + 2) 



(3) 



It has also been shown (e.g., David and Nagaraja, 2003) that 



VarX, 



cov(X(j),X(j)) 



—^Q. + o{n ) 

PiQj 
n + 2 



Q[Q'. + o{n 1), fori < j, 



where Q[ 



1 



dpi/dQi f(Qi) 



TTTT ifiQi) is called the density-quantile function). Thus, 



Var(0 = ^ 



.2 Pili yn/2 
n + 2 



i=l i<j 

Putting ([3]) and Q together yields, 



Qf + 2j2mw,P^Q'a^+oin-'). 



(4) 
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8 Simulation-efficient shortest probability intervals 

Finding the optimal weights that minimize MSE as defined in ([5| is then ap- 
proximately a quadratic programming problem. 

In this study we impose triangle kernels centered around the endpoints of the 
empirical shortest interval on the weights for computational stability. Specifi- 
cally, the estimate of the lower endpoint has the form, 

where i* is the index of the endpoint of the empirical shortest interval, b is the 
bandwidth in terms of data points, and Wi decreases linearly when i moves away 
from i* . We choose b to be of order ^/n in this study. This optimization problem 
is equivalent to minimizing MSE with the following constraints: 

i*+b/2 

Wi = 1 

i=i*-b/2 

Wi - Wi-l Wi-l - Wi-2 



Wi* - Wi*-l Wi* - Wi* + l 



for i = i*- 6/2+2, ...,i*,i* + 2,..., i*+b/2 



- X(^i*_i) X(^i*_^_i) - 

Wi*_b/2 > 
Wi*+b/2 > 

Wi* - Wi*+i > 0. (6) 

The above constraints refiect the piecewise linear and symmetric pattern of the 
kernel. In practice, Q, f, and A* can be substituted by the corresponding sample 
estimates Q, /, and A. 

The above quadratic programming problem can be rewritten in the conven- 
tional matrix form, 

MSE(f) = ^w'^'Dw - (fw, 

where 

W = iWi*_h/2, Wi*+b/2)'^, 

and D = (dij) is a symmetric matrix with 



n+2 



PiQj+QiQj), i<j, 
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= 2Q(A*)Q„ 

subject to 

A^w > Wo, 

with appropriate A and wq derived from the hnear constraints in 

2.3 Proof of simulation-consistency of the estimated HPD 

The following result ensures the simulation-consistency of our endpoint estima- 
tors when we use the empirical distribution and kernel density estimate. 

Under regularity conditions, with probability 1, 

—w "DnW — d^ w \ = min I —w Y)w — d w ] , 
2 J w \'2 I 



where D„ and d„ are empirical estimates of D and d based on empirical distri- 
bution function and kernel density estimates. 

To see this, we first show that D„ — )• D and — )• d uniformly as n — )• oo 
almost surely. By the Glivenko-Cantelli theorem, — i^||oo — )• 0, which implies 
Q Q almost surely, where denotes weak convergence, i.e., Q{t) — )• Q{t) at 
every t where Q is continuous (e.g., van der Vaart, 1998). It has also been shown 
that J'Ef{f{x)—f{x))'^dx = 0(n~^/^) under regularity conditions (see, e.g., van 
der Vaart, 1998), which implies that f{x) — t- /(x) almost surely for all x. The 
endpoints of the empirical shortest interval are simulation-consistent (Chen and 
Shao, 1999). 

The elements in matrix D„ result from simple arithmetic manipulations of 
Q and /, so dij — )• dij with probability 1, which implies, 

D„ —7- D uniformly and almost surely, 

given D is of finite dimension. We can prove the almost sure uniform convergence 
of dn to d in a similar manner. 

The optimization problem min^(^7i;-^D„it) — d^t/;) corresponds to calculating 
the smallest eigenvalue of an a,iigniente(i matrix constructGd. from D^i and dn- 
The above uniform convergence then imphes, 



lim mm[w DnW — d^w) = m.m[w Ttw — d w). 

n— >oo w 



The same proof works for the upper endpoint. 
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Figure 5: Bootstrapping procedure to get more stable weights. 



2.4 Bootstrapping the procedure to get a smoother estimate 

Results from quadratic programming as described above show that, as expected, 
Spin has a much reduced bias than the empirical shortest intervals. This is 
because the above procedure takes the shape of the empirical distribution into 
account. However, the variance remains at the same magnitude as that of the 



empirical shortest interval (as we shall see in the left panel in Figure 10), because 
the optimal weights derived from the empirical distribution are also subject 
to the same level of variability as the empirical shortest intervals. We can 
use the bootstrap (Efron, 1979) to smooth away some of this noise and thus 
further reduce the variance in the interval. Specifically, we bootstrap the original 
posterior draws B times (in this study we set B = 50) and calculate the Spin 
optimal weights for each of the bootstrapped samples. Here, we treat the weights 
as general functions of the posterior distribution under study rather than the 
endpoints of HPD interval of the posterior samples. We then compute the final 
weights as the average of the B sets of weights obtained from the above procedure 
(Figure [s]) . 



2.5 Bounded distributions 



As defined so far, our procedure necessarily yields an interval within the range 
of the simulations. This is undesirable if the distribution is bounded with the 
boundary included in the HPD interval (as in the right graph in Figure 1). To 
allow boundary estimates, we augment our simulations with a pseudo-datapoint 
(or two, if the distribution is known to be bounded on both sides). For example, 
if a distribution is defined on (0, oo) then we insert another datapoint at 0; if 
the probability space is (0, 1), we insert additional points at and 1. 
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2.6 Discrete and multimodal distributions 

If a distribution is continuous and unimodal, the highest posterior density region 
and shortest probabihty interval coincide. More generally, the highest posterior 
density region can be formed from disjoint intervals. For distributions with 
known boundary of disjoint parts, Spin can be applied to different regions sep- 
arately and a HPD region can be assembled using the derived disjoint intervals. 
When the nature of the underlying true distribution is unknown and the sample 
size is small, the inference of unimodality can be difficult. Therefore, in this 
paper, we have focused on estimating the shortest probability interval, recogniz- 
ing that, as with interval estimates in general, our procedure is less relevant for 
multimodal distributions. 

3 Results for simple theoretical examples 

We conduct simulation studies to evaluate the performance of our methods. We 
generate independent samples from the normal, t(5), and gamma(3) distribu- 
tions and construct 95% intervals using these samples. We consider sample sizes 
of 100, 300, 500, 1000 and 2000. For each setup, we generate 20,000 indepen- 
dent replicates and use these to compute root mean squared errors (RMSEs) 
for upper and lower endpoints. We also construct empirical shortest intervals 
as defined in ([2]), parametric intervals and central intervals for comparison. For 
parametric intervals, we calculate the sample mean and standard deviation. For 
the normal distribution, the interval takes the form of mean it 1.96 sd (for the 
t distribution we also implement the same form as "Gaussian approximation" 
for comparison); for the gamma, we use the mean and standard deviation to 
estimate its parameters first, and then numerically obtain the HPD interval us- 
ing the resulted density with the two estimates plugged in. The empirical 95% 
central interval is defined as the 2.5%th and 97.5%th percentiles of the sampled 
data points. We also use our methods to construct optimal central intervals (see 
Section [5]) for the two symmetric distributions. 

Figure [6] shows the intervals constructed for the standard normal distribu- 
tion and the t(5) distribution based on 500 simulation draws. The empirical 
shortest intervals tend to be too short in both cases, while Spins have better 
endpoint estimates. Empirical central intervals are more stable than empirical 
shortest intervals, and Spins have comparable RMSE for N(0, 1) and smaller 
RMSE for t(5). Our methods can further improve RMSE based on the empiri- 
cal central intervals as shown in the "central (QP)" row in Figure [6j The RMSE 
is the smallest if one specifies the correct parametric distribution and uses that 
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Figure 6: Spin for symmetric distributions: 95% intervals for the normal and 
t(5) distributions, in each case based on 500 independent draws. Each horizontal 
bar represents an interval from one simulation. The histograms of the lower 
ends and the upper ends are based on results from 20,000 simulations. The 
dotted vertical lines represent the true endpoints of the HPD intervals. Spin 
greatly outperforms the raw empirical shortest interval. The central interval 
(and its quadratic programming improvement ) does even better for the Gaussian 
but is worse for the t(5) and in any case does not generalize to asymmetric 
distributions. The intervals estimated by fitting a Gaussian distribution do the 
best for the normal model but are disastrous when the model is wrong. 



Ying Liu, Andrew Gelman, and Tian Zheng 



13 



information to construct interval estimates, while in practice the underlying dis- 
tribution is usually not totally known, and mis-specifying it can result in far-off 
intervals (the right bottom panel in Figure |6]). 

Figure [7| shows the empirical shortest. Spin, and parametric intervals con- 
structed from 500 samples of the gamma distribution with shape parameter 3. 
Spin gets more accurate endpoint estimates than empirical shortest intervals. 
Specifically, for the lower end where the density is relatively high. Spin esti- 
mates are less variable, and for the upper end at the tail of the density, Spin 
shows a smaller bias. Again, the lowest RMSE comes from taking advantage 
of the parametric form of the posterior distribution, which is rarely practical 
in real MCMC applications. Hence the RMSE using the parametric form rep- 
resents a rough lower bound on the Monte Carlo error in any HPD computed 
from simulations. 

Figure [S] shows the intervals constructed for MCMC normal samples. Specif- 
ically, the Gibbs sampler is used to draw samples from a standard bivariate 
normal distribution with correlation 0.9. We use this example to explore how 
Spin works on simulations with high autocorrelation. Two chains each with 1000 
samples are drawn with Gibbs sampling. For one variable, every ten draws are 
recorded for Spin construction, resulting in 200 samples, which is roughly the 
level of the effective sample size in this case. This is a typical senario in practice 
when MCMC techniques are adopted for multivariate distributions. Again Spin 
greatly outperforms the empirical shortest interval in case of highly correlated 
draws. 

We further investigate coverage probabilities of the different intervals con- 
structed (Figure [9]) . Empirical shortest intervals have the lowest coverage prob- 
ability, which is as expected since they are biased towards shorter intervals (see 
Figure [g] and Figure [T]) . Coverage probabilities of Spin are closer to the nominal 
coverage (95%) for both normal and gamma distributions. Comparable coverage 
is observed for central intervals. As expected, parametric intervals represent a 
gold standard and have the most accurate coverage. 



Figure 10 shows the bias-variance decomposition of different interval esti- 
mates for normal and gamma distributions under sample sizes 100, 300, 500, 
1,000 and 2,000. We average lower and upper ends for the normal case due 
to symmetry. For both distributions. Spin has both well-reduced variance and 
bias compared to the empirical shortest intervals. The upper end estimates of 
empirical central intervals for the gamma have a large variance since the cor- 
responding density is low so the observed simulations in this region are more 
variable. It is worth pointing out that the computational time for Spin is negli- 
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Figure 7: Spin for an asymmetric distribution. 95% intervals for the gamma 
distributions with shape parameter 3, as estimated from 500 independent draws. 
Each horizontal bar represents an interval from one simulation. The histograms 
are based on results from 20,000 simulations. The dotted vertical lines represent 
the true endpoints of the HPD interval. Spin outperforms the empirical shortest 
interval. The interval obtained from a parametric fit is even better but this 
approach cannot be applied in general; rather, it represents an optimality bound 
for any method. 
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Normal from Gibbs Sampler 

RMSE(lb)= 0.316 RMSE(u b) = 0.317 
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Figure 8: Spin for MCMC samples. 95% intervals for normal samples from Gibhs 
sampler, in each case based on 200 draws. Each horizontal bar represents an 
interval from one simulation. The histograms are based on results from 20,000 
simulations. The dotted vertical lines represent the true endpoints of the HPD 
intervals. Spin greatly outperforms the raw empirical shortest interval. The 
central interval (and its quadratic programming improvement ) does even better. 
Again the intervals estimated by fitting a Gaussian distribution do the best. 
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Figure 9: Distribution of coverage probabilities for Spin and other 95% intervals 
calculated based on 500 simulations for the normal and gamma(3) distributions. 
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Figure 10: Bias-variance decomposition for 95% intervals for normal and 
gamma(3) examples, as a function of the number of simulation draws. Because 
of the symmetry of the normal distribution, we averaged its errors for upper and 
lower endpoints. Results from Spin without bootstrap are shown for normal for 
description purpose. 
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gible compared to sampling, thus it is a more efficient way to obtain improved 



interval estimates. In the normal example shown in the left panel in Figure 10 
rather than increasing the sample size from 300 to 500 to reduce error, one can 
spend less time to compute Spin with the 300 samples and get a even better 
interval. 

We also carried out experiments with even bigger samples and intervals of 
other coverages (90% and 50%), and got similar results. Spin beats the empirical 
shortest interval in RMSE (which makes sense, given that Spin is optimizing over 
a class of estimators that includes the empirical shortest as a special case). 



4 Results for two real-data examples 

In this section, we apply our methods to two applied examples of hierarchical 
Bayesian models, one from education and one from sociology. In the first exam- 
ple, we show the advantages of Spin over central and empirical shortest intervals; 
in the second example, we demonstrate the routine use of Spin to summarize 
posterior inferences. 

Our first example is a Bayesian analysis from Rubin (1980) of a hierarchi- 
cal model of data from a set of experiments performed on eight schools. The 
group-level scale parameter (which corresponds to the between-school standard 
deviation of the underlying treatment effects) has a posterior distribution that 
is asymmetric with a mode at zero (as shown in the right panel of Figure [T]) . 
Central probability intervals for this scale parameter (as presented, for example, 
in the analysis of these data by Gelman et al., 1995) are unsatisfying in that 
they exclude a small segment near zero where the posterior distribution is in 



fact largest. Figure 11 shows the 95% empirical shortest intervals and Spin con- 
structed from 500 draws. The results of empirical shortest intervals for 8 schools 
are from including the zero point in the simulations. Spin has smaller RMSE 



than both empirical shortest and central intervals (Figure 1 1 and Figure [12 



For our final example, we fit the social network model of Zheng et al. (2006) 
using MCMC and construct 95% Spins for the overdispersion parameters based 
on 200 posterior draws. The posterior is asymmetric and bounded below at 1. 



Figure 13 is a partial replot of Figure 4 from Zheng et al. (2006) with Spins 
added. For this type of asymmetric posterior we prefer the estimated HPDs to 
the corresponding central intervals as HPDs more precisely capture the values 
of the parameter that are supported by the posterior distribution. 
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RMSE(lb)= 0.022 



RMSE(ub)= 0.946 
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Figure 11: Spin for the group-level standard deviation parameter in the eight 
schools example, as estimated from 500 independent draws from the posterior 
distribution (which is the right density curve in Figure [I| a distribution that 
is constrained to be nonnegative and has a minimum at zero). The histograms 
in this figure are based on results from 20,000 simulations. The dotted vertical 
lines represent the true endpoints of the HPD interval as calculated numerically 
from the posterior density. Spin does better than the empirical shortest interval, 
especially at the left end, where its smoothing tends to (correctly) pull the lower 
bound of the interval all the way to the boundary at 0. 
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Figure 12: Bias-variance decomposition for 95% intervals for the eight-school 
example, as a function of the number of simulation draws. 



5 Discussion 

We have presented a novel optimal approach for constructing reduced-error 
shortest probability intervals (Spin). Simulation studies and real data exam- 
ples show the advantage of Spin over the empirical shortest interval. Another 
commonly used interval estimate in Bayesian inference is the central interval. 
For symmetric distributions, central intervals and HPDs are the same; otherwise 
we agree with Box and Tiao (1973) that the HPD is generally preferable to the 
central interval as an inferential summary (Figure [T]) . In our examples we have 
found that for symmetric distributions Spin and empirical central intervals have 



comparable RMSEs and coverage probabilities (Figures rol^ and 10). Therefore 



we recommend Spin as a default procedure for computing HPD intervals from 
simulations, as it is as computationally stable as the central intervals which are 
currently standard in practice. 



We set the bandwidth parameter 6 in ([6|) to ^/n, which seems to work well 
for a variety of distributions. We also carried out sensitivity analysis by varying 
b and found that large b tends to result in more stable endpoint estimates where 
the density is relatively high but can lead to noisy estimates where the density 
is low. This makes sense: in low-density regions, adding more points to the 
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Figure 13: 95% central intervals (black lines) and Spins (red lines) for the 
overdispersion parameters in the "How many X's do you know?" study. The pa- 
rameter in each row is a measure of the social clustering of a certain group in the 
general population: groups of people identihed by first names have low overdis- 
persion and are close to randomly distributed in the social network, whereas 
categories such as airline pilots or American Indians are more overdispersed 
(that is, non-randomly distributed). We prefer the Spins as providing better 
summaries of these highly skewed posterior distributions. However, the differ- 
ences between central intervals and Spins are not large; our real point here is not 
that the Spins are much better but that they will work just fine in routine applied 
Bayesian practice, satisfying the same needs as were served by central intervals 
but without that annoying behavior when distributions are highly asymmetric. 
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weighted average may introduce noise instead of true signals. Based on our 
experiments, we believe the default value b = \/n is a safe general choice. 

Our approach can be considered more generally as a method of using weighted 
averages of order statistics to construct optimal interval estimates. One can 
replace Q(A*) in ([5]) by the endpoints of any reasonable empirical interval es- 
timates, and obtain improved intervals by using our quadratic programming 
strategy (such as the improved central intervals shown in Figure [6]) . 

One concern that arises is the computational cost of performing Spin itself. 
Our simulations show Spin intervals to have better simulation coverage and ap- 
preciably lower mean squared error compared to the empirical HPD, but for 
simple problems in which one can quickly draw direct posterior simulations, it 
could be simpler to forget Spin and instead just double the size of the posterior 
sample. More and more, though, we find ourselves computing Bayesian mod- 
els using elaborate Markov chain simulation algorithms for which it can take 
hundreds of thousands of steps, and hours or even days of computing time, to 
obtain an effective sample size of a few hundred posterior simulation draws. 
In this case, the Spin calculations are a small price to pay for obtaining more 
accurate and stable HPD intervals. 

We have demonstrated that our Spin procedure works well in a range of the- 
oretical and applied problems, that it is simulation-consistent, computationally 
feasible, addresses the boundary problem, and is optimal within a certain class 
of procedures that include the empirical shortest interval as a special case. We 
do not claim, however, that the procedure is optimal in any universal sense. We 
see the key contribution of the present paper as developing a practical procedure 
to compute shortest probability intervals from simulation in a way that is supe- 
rior to the naive approach and is competitive (in terms of simulation variability) 
with central probability intervals. Now that Spin can be computed routinely, 
we anticipate further research improvements on posterior summaries. 
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